Pyruvate Carboxylase from
Corynebacterium glutamicum

STATEMENT OF GOVERNMENT 
RIGHTS IN THE INVENTION 

Part of the work performed during development of this invention utilized 
U.S. Government funds. The U.S. Government has certain rights in this invention. 

BACKGROUND OF THE INVENTION 

Field of the Invention 



The present invention relates to a Corynebacterium glutamicum pyruvate
carboxylase protein and to polynucleotides encoding this protein.

Background Information 

Pyruvate carboxylase is an important anaplerotic enzyme that replenishes
oxaloacetate consumed for biosynthesis during growth, or for lysine and glutamic
acid production in industrial fermentations.

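The two-step reaction mechanism catalyzed by pyruvate carboxylase is
shown below:

\[
\text{MgATP} + \text{HCO}_3^- + \text{ENZ-biotin}
\xrightarrow{\ \text{Mg}^{2+},\ \text{acetyl-CoA}\ }
\text{MgADP} + \text{P}_i + \text{ENZ-biotin-CO}_2^- \tag{1}
\]

\[
\text{ENZ-biotin-CO}_2^- + \text{pyruvate}
\longrightarrow
\text{ENZ-biotin} + \text{oxaloacetate} \tag{2}
\]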

In reaction (1), the ATP-dependent biotin carboxylase domain carboxylates
a biotin prosthetic group linked to a specific lysine residue in the
biotin-carboxyl-carrier protein (BCCP) domain. Acetyl-coenzyme A activates
reaction (1) by increasing the rate of bicarbonate-dependent ATP cleavage. In
reaction (2), the BCCP domain donates the CO2 to pyruvate in a reaction catalyzed
by the transcarboxylase domain (Attwood, P.V., Int. J. Biochem. Cell. Biol.
27:231-249 (1995)).

Pyruvate carboxylase genes have been cloned and sequenced from
Rhizobium etli (Dunn, M.F., et al., J. Bacteriol. 178:5960-5970 (1996)), Bacillus
stearothermophilus (Kondo, H., et al., Gene 191:47-50 (1997)), Bacillus subtilis
(Genbank accession no. Z97025), Mycobacterium tuberculosis (Genbank accession no.
Z83018), and Methanobacterium thermoautotrophicum (Mukhopadhyay, B., J. Biol.
Chem. 273:5155-5166 (1998)). Pyruvate carboxylase activity has been measured
previously in Brevibacterium lactofermentum (Tosaka, O., et al., Agric. Biol. Chem.
43:1513-1519 (1979)) and Corynebacterium glutamicum (Peters-Wendisch, P.G., et al.,
Microbiology 143:1095-1103 (1997)).

Previous research has indicated that the yield and productivity of the
aspartate family of amino acids depend critically on the carbon flux through
anaplerotic pathways (Vallino, J.J., & Stephanopoulos, G., Biotechnol. Bioeng.
41:633-646 (1993)). On the basis of the metabolite balances, it can be shown that
the rate of lysine production is less than or equal to the rate of oxaloacetate
synthesis via the anaplerotic pathways.



SUMMARY OF THE INVENTION 



The present invention provides isolated nucleic acid molecules
comprising a polynucleotide encoding a pyruvate carboxylase polypeptide having
the amino acid sequence in Figure 1 (SEQ ID NO:2) or the amino acid sequence
encoded by the cosmid clone deposited in a bacterial host as ATCC Deposit
Number ______. The nucleotide sequence determined by sequencing the deposited
pyruvate carboxylase cosmid clone, which is shown in Figure 1 (SEQ ID NO:1),
contains an open reading frame encoding a polypeptide of 1140 amino acid residues
which has a deduced molecular weight of about 123.6 kDa. The 1140 amino acid
sequence of the predicted pyruvate carboxylase protein is shown in Figure 1 and in
SEQ ID NO:2.

Thus, one aspect of the invention provides an isolated nucleic acid
molecule comprising a polynucleotide having a nucleotide sequence selected from
the group consisting of: (a) a nucleotide sequence encoding the pyruvate
carboxylase polypeptide having the complete amino acid sequence in SEQ ID
NO:2; (b) a nucleotide sequence encoding the pyruvate carboxylase polypeptide
having the complete amino acid sequence encoded by the cosmid clone contained
in ATCC Deposit No. ______; and (c) a nucleotide sequence complementary to any
of the nucleotide sequences in (a) or (b) above.

Further embodiments of the invention include isolated nucleic acid
molecules that comprise a polynucleotide having a nucleotide sequence at least 90%
identical, and more preferably at least 95%, 97%, 98% or 99% identical, to any of
the nucleotide sequences in (a), (b) or (c) above, or a polynucleotide which
hybridizes under stringent hybridization conditions to a polynucleotide having a
nucleotide sequence identical to a nucleotide sequence in (a), (b) or (c), above. The
polynucleotide which hybridizes does not hybridize under stringent hybridization
conditions to a polynucleotide having a nucleotide sequence consisting of only A
residues or of only T residues.

The present invention also relates to recombinant vectors which include
the isolated nucleic acid molecules of the present invention and to host cells
containing the recombinant vectors, as well as to methods of making such vectors
and host cells and for using them for production of pyruvate carboxylase
polypeptides or peptides by recombinant techniques.

The invention further provides an isolated pyruvate carboxylase
polypeptide having an amino acid sequence selected from the group consisting of:
(a) the amino acid sequence of the pyruvate carboxylase polypeptide having the
amino acid sequence shown in Figure 1 (SEQ ID NO:2); and (b) the amino acid
sequence of the pyruvate carboxylase polypeptide having the complete amino acid
sequence encoded by the cosmid clone contained in ATCC Deposit No. ______. The
polypeptides of the present invention also include polypeptides having an amino
acid sequence with at least 90% similarity, more preferably at least 95% similarity,
to those described in (a) or (b) above, as well as polypeptides having an amino acid
sequence at least 70% identical, more preferably at least 90% identical, and still
more preferably 95%, 97%, 98% or 99% identical to those above.

BRIEF DESCRIPTION OF THE DRAWINGS



Figure 1 shows the nucleotide (SEQ ID NO:1) and deduced amino acid
(SEQ ID NO:2) sequences of the complete pyruvate carboxylase protein determined
by sequencing of the DNA clone contained in ATCC Deposit No. ______. The
protein has a sequence of about 1140 amino acid residues and a deduced molecular
weight of about 123.6 kDa.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides isolated nucleic acid molecules
comprising a polynucleotide encoding the pyruvate carboxylase protein having the
amino acid sequence shown in Figure 1 (SEQ ID NO:2) which was determined by
sequencing a cloned cosmid. The pyruvate carboxylase protein of the present
invention shares sequence homology with M. tuberculosis and human pyruvate
carboxylase proteins. The nucleotide sequence shown in Figure 1 (SEQ ID NO:1)
was obtained by sequencing cosmid III F10 encoding a pyruvate carboxylase
polypeptide, which was deposited on ______ at the American Type Culture
Collection, 10801 University Blvd., Manassas, VA 20110-2209, and given accession
number ______.

Nucleic Acid Molecules 

Unless otherwise indicated, all nucleotide sequences determined by
sequencing a DNA molecule herein were determined using an automated DNA
sequencer (such as the ABI Prism 377), and all amino acid sequences of
polypeptides encoded by DNA molecules determined herein were predicted by
translation of a DNA sequence determined as above. Therefore, as is known in the
art for any DNA sequence determined by this automated approach, any nucleotide
sequence determined herein may contain some errors. Nucleotide sequences
determined by automation are typically at least about 90% identical, more typically
at least about 95% to at least about 99.9% identical to the actual nucleotide
sequence of the sequenced DNA molecule. The actual sequence can be more
precisely determined by other approaches including manual DNA sequencing
methods well known in the art. As is also known in the art, a single insertion or
deletion in a determined nucleotide sequence compared to the actual sequence will
cause a frame shift in translation of the nucleotide sequence such that the predicted
amino acid sequence encoded by a determined nucleotide sequence will be
completely different from the amino acid sequence actually encoded by the
sequenced DNA molecule, beginning at the point of such an insertion or deletion.
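
By way of illustration only, the effect of a single inserted nucleotide on the
predicted amino acid sequence can be sketched as follows. The 18-nucleotide
reading frame and the helper name `translate` are hypothetical and are not taken
from SEQ ID NO:1; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand in frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Hypothetical "actual" reading frame (illustration only).
actual = "ATGGCTAAAGGTCTGTGA"
# Simulated sequencing error: one extra G inserted after the second codon.
determined = actual[:6] + "G" + actual[6:]

print(translate(actual))      # MAKGL*
print(translate(determined))  # MAERSV
```

The first two residues agree, and every residue after the insertion point
differs, which is the frame shift described above.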

Unless otherwise indicated, each "nucleotide sequence" set forth herein
is presented as a sequence of deoxyribonucleotides (abbreviated A, G, C and T).
However, by "nucleotide sequence" of a nucleic acid molecule or polynucleotide
is intended, for a DNA molecule or polynucleotide, a sequence of
deoxyribonucleotides, and for an RNA molecule or polynucleotide, the
corresponding sequence of ribonucleotides (A, G, C and U) where each thymidine
deoxynucleotide (T) in the specified deoxynucleotide sequence is replaced by the
ribonucleotide uridine (U). For instance, reference to an RNA molecule having the
sequence of SEQ ID NO:1 set forth using deoxyribonucleotide abbreviations is
intended to indicate an RNA molecule having a sequence in which each
deoxynucleotide A, G or C of SEQ ID NO:1 has been replaced by the
corresponding ribonucleotide A, G or C, and each deoxynucleotide T has been
replaced by a ribonucleotide U.
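
As a minimal sketch of this convention, the RNA reading of a sequence written
with deoxyribonucleotide abbreviations can be obtained by replacing each T with
U. The function name `to_rna` and the input fragment are hypothetical.

```python
def to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence written with A, G, C and T."""
    dna = dna.upper()
    unexpected = set(dna) - set("AGCT")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    # A, G and C keep their letters; only T changes, to U.
    return dna.replace("T", "U")


print(to_rna("ATGTCGACT"))  # AUGUCGACU
```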

Using the information provided herein, such as the nucleotide sequence
in Figure 1, a nucleic acid molecule of the present invention encoding a pyruvate
carboxylase polypeptide may be obtained using standard cloning and screening
procedures, such as those for cloning DNAs using mRNA as starting material. The
pyruvate carboxylase protein shown in Figure 1 (SEQ ID NO:2) is about 63%
identical to M. tuberculosis pyruvate carboxylase and 44% identical to human
pyruvate carboxylase. As one of ordinary skill would appreciate, due to the
possibilities of sequencing errors discussed above, as well as the variability of
cleavage sites for leaders in different known proteins, the actual pyruvate
carboxylase polypeptide encoded by the deposited cosmid comprises about 1140
amino acids, but may be anywhere in the range of 1133-1147 amino acids.
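
As a rough consistency check on the figures quoted in this specification (about
1140 residues and a deduced molecular weight of about 123.6 kDa), the arithmetic
can be sketched as follows. The average residue mass of about 110 Da is a general
rule of thumb assumed here for illustration; it is not a value taken from this
specification.

```python
RESIDUES = 1140              # from the specification
RESIDUE_RANGE = (1133, 1147) # from the specification
AVG_RESIDUE_MASS_DA = 110    # assumption: rule-of-thumb average residue mass

coding_nt = RESIDUES * 3     # nucleotides in the sense codons
orf_nt = coding_nt + 3       # plus one termination codon
approx_mass_kda = RESIDUES * AVG_RESIDUE_MASS_DA / 1000
coding_nt_range = tuple(n * 3 for n in RESIDUE_RANGE)

print(coding_nt, orf_nt)     # 3420 3423
print(approx_mass_kda)       # 125.4
print(coding_nt_range)       # (3399, 3441)
```

The rough estimate of about 125 kDa is of the same order as the deduced value
of about 123.6 kDa, which is computed from the actual residue composition.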



As indicated, nucleic acid molecules of the present invention may be in
the form of RNA, such as mRNA, or in the form of DNA, including, for instance,
cDNA and genomic DNA obtained by cloning or produced synthetically. The DNA
may be double-stranded or single-stranded. Single-stranded DNA or RNA may be
the coding strand, also known as the sense strand, or it may be the non-coding
strand, also referred to as the anti-sense strand.
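
For illustration, the anti-sense strand of a DNA sequence, read 5' to 3', is
its reverse complement. The sketch below assumes a sequence written with A, C,
G and T only; the function name `reverse_complement` and the fragment are
hypothetical.

```python
# Watson-Crick pairing for DNA: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


sense = "ATGGCTAAA"  # hypothetical coding-strand fragment
antisense = reverse_complement(sense)
print(antisense)     # TTTAGCCAT
```

Applying the function twice returns the original sense strand.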

By "isolated" nucleic acid molecule(s) is intended a nucleic acid
molecule, DNA or RNA, which has been removed from its native environment. For
example, recombinant DNA molecules contained in a vector are considered isolated
for the purposes of the present invention. Further examples of isolated DNA
molecules include recombinant DNA molecules maintained in heterologous host
cells or purified (partially or substantially) DNA molecules in solution. Isolated
RNA molecules include in vivo or in vitro RNA transcripts of the DNA molecules
of the present invention. Isolated nucleic acid molecules according to the present
invention further include such molecules produced synthetically.

Isolated nucleic acid molecules of the present invention include DNA
molecules comprising an open reading frame (ORF) with an initiation codon at
positions 199-201 of the nucleotide sequence shown in Figure 1 (SEQ ID NO:1);
DNA molecules comprising the coding sequence for the pyruvate carboxylase
protein shown in Figure 1 and SEQ ID NO:2; and DNA molecules which comprise
a sequence substantially different from those described above but which, due to the
degeneracy of the genetic code, still encode the pyruvate carboxylase protein. Of
course, the genetic code is well known in the art. Thus, it would be routine for one
skilled in the art to generate the degenerate variants described above.
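
The degeneracy referred to above can be illustrated with a short sketch that
counts how many distinct DNA sequences encode a given peptide under the standard
genetic code. The five-residue peptide, the two variant sequences and the helper
names are hypothetical and are not derived from SEQ ID NO:1 or SEQ ID NO:2.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of synonymous codons available for each amino acid.
SYNONYM_COUNT = Counter(CODON_TABLE.values())


def translate(dna: str) -> str:
    """Translate a DNA coding strand in frame 1."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_encodings(peptide: str) -> int:
    """Count the distinct DNA sequences that encode the peptide."""
    return prod(SYNONYM_COUNT[aa] for aa in peptide.upper())


variant_1 = "ATGGCTAAAGGTCTG"  # hypothetical
variant_2 = "ATGGCGAAGGGCTTA"  # differs at four codons, same peptide

print(translate(variant_1), translate(variant_2))  # MAKGL MAKGL
print(count_encodings("MAKGL"))                    # 192
```

Even this short peptide has 192 degenerate encodings (1 x 4 x 2 x 4 x 6), so the
number of degenerate variants of a full-length coding sequence is very large.
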

In another aspect, the invention provides isolated nucleic acid molecules
encoding the pyruvate carboxylase polypeptide having an amino acid sequence
encoded by the cosmid clone deposited as ATCC Deposit No. ______. Preferably,
this nucleic acid molecule will encode the polypeptide encoded by the
above-described deposited clone. The invention further provides an isolated
nucleic acid molecule having the nucleotide sequence shown in Figure 1 (SEQ ID
NO:1) or the nucleotide sequence of the pyruvate carboxylase DNA contained in the
above-described deposited clone, or a nucleic acid molecule having a sequence
complementary to one of the above sequences.

In another aspect, the invention provides an isolated nucleic acid
molecule comprising a polynucleotide which hybridizes under stringent
hybridization conditions to a portion of the polynucleotide in a nucleic acid
molecule of the invention described above, for instance, the cosmid clone contained
in ATCC Deposit ______. By "stringent hybridization conditions" is intended
overnight incubation at 42 °C in a solution comprising: 50% formamide, 5x SSC
(1x SSC = 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate
(pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured,
sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about
65 °C. By a polynucleotide which hybridizes to a "portion" of a polynucleotide is
intended a polynucleotide (either DNA or RNA) hybridizing to at least about 15
nucleotides (nt), and more preferably at least about 20 nt, still more preferably at
least about 30 nt, and even more preferably about 30-70 nt of the reference
polynucleotide. These are useful as diagnostic probes and primers.

Of course, polynucleotides hybridizing to a larger portion of the reference
polynucleotide (e.g., the deposited cosmid clone), for instance, a portion 50-750 nt
in length, or even to the entire length of the reference polynucleotide, are also
useful as probes according to the present invention, as are polynucleotides
corresponding to most, if not all, of the nucleotide sequence of the deposited DNA
or the nucleotide sequence as shown in Figure 1 (SEQ ID NO:1). By a portion of a
polynucleotide of "at least 20 nt in length," for example, is intended 20 or more
contiguous nucleotides from the nucleotide sequence of the reference
polynucleotide (e.g., the deposited DNA or the nucleotide sequence as shown in
Figure 1 (SEQ ID NO:1)). As indicated, such portions are useful diagnostically
either as a probe according to conventional DNA hybridization techniques or as
primers for amplification of a target sequence by the polymerase chain reaction
(PCR), as described, for instance, in Molecular Cloning, A Laboratory Manual,
2nd edition, edited by Sambrook, J., Fritsch, E. F. and Maniatis, T., (1989), Cold
Spring Harbor Laboratory Press, the entire disclosure of which is hereby
incorporated herein by reference.
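
The definition of a portion of "at least 20 nt in length" as 20 or more
contiguous nucleotides of the reference can be sketched as a simple substring
test. The 30-nt reference below and the function names are hypothetical; a real
check would use the nucleotide sequence of SEQ ID NO:1.

```python
from typing import Iterator


def is_contiguous_portion(candidate: str, reference: str, min_len: int = 20) -> bool:
    """True if candidate is at least min_len nt long and occurs contiguously in reference."""
    candidate, reference = candidate.upper(), reference.upper()
    return len(candidate) >= min_len and candidate in reference


def portions(reference: str, length: int) -> Iterator[str]:
    """Yield every contiguous portion of the given length, 5' to 3'."""
    for start in range(len(reference) - length + 1):
        yield reference[start:start + length]


reference = "ATGGCTAAAGGTCTGACCGATTTGCAAGCT"  # hypothetical 30-nt reference
contiguous = reference[5:25]                   # 20 contiguous nt
spliced = reference[:10] + reference[20:]      # 20 nt, but not contiguous

print(is_contiguous_portion(contiguous, reference))  # True
print(is_contiguous_portion(spliced, reference))     # False
print(len(list(portions(reference, 20))))            # 11
```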

Since a pyruvate carboxylase cosmid clone has been deposited and its
determined nucleotide sequence is provided in Figure 1 (SEQ ID NO:1), generating
polynucleotides which hybridize to a portion of the pyruvate carboxylase DNA
molecule would be routine to the skilled artisan. For example, restriction
endonuclease cleavage or shearing by sonication of the pyruvate carboxylase
cosmid clone could easily be used to generate DNA portions of various sizes which
are polynucleotides that hybridize to a portion of the pyruvate carboxylase DNA
molecule. Alternatively, the hybridizing polynucleotides of the present invention
could be generated synthetically according to known techniques.

As indicated, nucleic acid molecules of the present invention which
encode the pyruvate carboxylase polypeptide may include, but are not limited to,
those encoding the amino acid sequence of the polypeptide by itself; the coding
sequence for the polypeptide and additional sequences, such as a pre-, pro- or
prepro-protein sequence; the coding sequence of the polypeptide, with or without
the aforementioned additional coding sequences, together with additional,
non-coding sequences, including, for example, but not limited to, introns and
non-coding 5' and 3' sequences, such as the transcribed, non-translated sequences
that play a role in transcription, mRNA processing (including splicing and
polyadenylation signals, for example), ribosome binding and stability of mRNA;
and an additional coding sequence which codes for additional amino acids, such as
those which provide additional functionalities. Thus, the sequence encoding the
polypeptide may be fused to a marker sequence, such as a sequence encoding a
peptide which facilitates purification of the fused polypeptide. In certain preferred
embodiments of this aspect of the invention, the marker amino acid sequence is a
hexa-histidine peptide, such as the tag provided in a pQE vector (Qiagen, Inc.),
among others, many of which are commercially available. As described in Gentz
et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for instance, hexa-histidine
provides for convenient purification of the fusion protein. The "HA" tag is another
peptide useful for purification which corresponds to an epitope derived from the
influenza hemagglutinin protein, which has been described by Wilson et al., Cell
37:767 (1984).

The present invention further relates to variants of the nucleic acid
molecules of the present invention, which encode portions, analogs or derivatives
of the pyruvate carboxylase protein. Variants may occur naturally, such as a natural
allelic variant. By an "allelic variant" is intended one of several alternate forms of
a gene occupying a given locus on a chromosome of an organism (Genes II, Lewin,
ed.). Non-naturally occurring variants may be produced using art-known
mutagenesis techniques.

Such variants include those produced by nucleotide substitutions,
deletions or additions. The substitutions, deletions or additions may involve one
or more nucleotides. The variants may be altered in coding or non-coding regions
or both. Alterations in the coding regions may produce conservative or
non-conservative amino acid substitutions, deletions or additions. Especially
preferred among these are silent substitutions, additions and deletions, which do
not alter the properties and activities of the pyruvate carboxylase protein or
portions thereof. Also especially preferred in this regard are conservative
substitutions. Most highly preferred are nucleic acid molecules encoding the
pyruvate carboxylase protein having the amino acid sequence shown in Figure 1
(SEQ ID NO:2).

Also preferred are mutants or variants in which pyruvate carboxylase is
preferably expressed at a level 2- to 20-fold higher than its expression in
C. glutamicum, as well as feedback inhibition mutants.

Further embodiments of the invention include isolated nucleic acid
molecules comprising a polynucleotide having a nucleotide sequence at least 90%
identical, and more preferably at least 95%, 97%, 98% or 99% identical to (a) a
nucleotide sequence encoding the pyruvate carboxylase polypeptide having the
complete amino acid sequence in SEQ ID NO:2; (b) a nucleotide sequence
encoding the pyruvate carboxylase polypeptide having the complete amino acid
sequence encoded by the cosmid clone contained in ATCC Deposit No. ______; or
(c) a nucleotide sequence complementary to any of the nucleotide sequences in (a)
or (b).

By a polynucleotide having a nucleotide sequence at least, for example,
95% "identical" to a reference nucleotide sequence encoding a pyruvate carboxylase
polypeptide is intended that the nucleotide sequence of the polynucleotide is
identical to the reference sequence except that the polynucleotide sequence may
include up to five point mutations per each 100 nucleotides of the reference
nucleotide sequence encoding the pyruvate carboxylase polypeptide. In other
words, to obtain a polynucleotide having a nucleotide sequence at least 95%
identical to a reference nucleotide sequence, up to 5% of the nucleotides in the
reference sequence may be deleted or substituted with another nucleotide, or a
number of nucleotides up to 5% of the total nucleotides in the reference sequence
may be inserted into the reference sequence. These mutations of the reference
sequence may occur at the 5' or 3' terminal positions of the reference nucleotide
sequence or anywhere between those terminal positions, interspersed either
individually among nucleotides in the reference sequence or in one or more
contiguous groups within the reference sequence.
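
One possible reading of this definition is sketched below: each substituted,
deleted or inserted nucleotide in a pairwise alignment counts as one point
mutation, and the total is expressed relative to the full length of the reference
sequence. The function name, the 20-nt reference and the use of "-" for alignment
gaps are assumptions made for illustration only.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent identity over the full reference length for two gapped, aligned sequences."""
    q, r = aligned_query.upper(), aligned_reference.upper()
    if len(q) != len(r):
        raise ValueError("aligned sequences must have equal length")
    # Reference length excludes gap columns introduced by insertions in the query.
    ref_len = sum(1 for base in r if base != "-")
    # Substitutions, deletions and insertions each count as one difference per column.
    differences = sum(1 for a, b in zip(q, r) if a != b)
    return 100.0 * (ref_len - differences) / ref_len


reference = "ATGGCTAAAGGTCTGACCGA"  # hypothetical 20-nt reference

print(percent_identity("ATGGCTAAAGGTCTGACCGT", reference))                # 95.0
print(percent_identity("ATGGCTAAAG-TCTGACCGA", reference))                # 95.0
print(percent_identity("ATGGCTAAAGCGTCTGACCGA", "ATGGCTAAAG-GTCTGACCGA"))  # 95.0
```

The three calls show one substitution, one deletion and one insertion,
respectively, each giving 95% identity against a 20-nt reference.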

As a practical matter, whether any particular nucleic acid molecule is at
least 90%, 95%, 97%, 98% or 99% identical to, for instance, the nucleotide
sequence shown in Figure 1 or to the nucleotide sequence of the deposited cosmid
clone can be determined conventionally using known computer programs such as
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix,
Genetics Computer Group, University Research Park, 575 Science Drive, Madison,
WI 53711). Bestfit uses the local homology algorithm of Smith and Waterman
(Advances in Applied Mathematics 2:482-489 (1981)) to find the best segment of
homology between two sequences. When using Bestfit or any other sequence
alignment program to determine whether a particular sequence is, for instance, 95%
identical to a reference sequence according to the present invention, the parameters
are set, of course, such that the percentage of identity is calculated over the full
length of the reference nucleotide sequence and that gaps in homology of up to 5%
of the total number of nucleotides in the reference sequence are allowed.
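
A minimal sketch of Smith-Waterman local alignment is given below for
illustration. It is not the Bestfit program: the scoring values (match +1,
mismatch -1, linear gap penalty -2) and the two 15-nt input sequences are
assumptions chosen to keep the example small.

```python
def smith_waterman(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; local alignment never lets a cell drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences sharing the 9-nt segment ATGGCTAAA.
result = smith_waterman("TTTATGGCTAAACCC", "GGGATGGCTAAATTT")
print(result)  # (9, 'ATGGCTAAA', 'ATGGCTAAA')
```

The local alignment recovers only the shared segment. Percent identity as
defined above must still be calculated over the full length of the reference
sequence, not over the aligned segment alone.
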

The present application is directed to nucleic acid molecules at least 90%,
95%, 97%, 98% or 99% identical to the nucleic acid sequence shown in Figure 1
(SEQ ID NO:1) or to the nucleic acid sequence of the deposited DNA, irrespective
of whether they encode a polypeptide having pyruvate carboxylase activity. This
is because, even where a particular nucleic acid molecule does not encode a
polypeptide having pyruvate carboxylase activity, one of skill in the art would still
know how to use the nucleic acid molecule, for instance, as a hybridization probe
or a polymerase chain reaction (PCR) primer.



WO 00/39305 PCT/US98/27301 

Preferred, however, are nucleic acid molecules having sequences at least
90%, 95%, 97%, 98% or 99% identical to the nucleic acid sequence shown in
Figure 1 (SEQ ID NO:1) or to the nucleic acid sequence of the deposited DNA
which do, in fact, encode a polypeptide having pyruvate carboxylase protein
activity. By "a polypeptide having pyruvate carboxylase activity" is intended
polypeptides exhibiting activity similar, but not necessarily identical, to an activity
of the pyruvate carboxylase protein of the invention as measured in a particular
biological assay.

Of course, due to the degeneracy of the genetic code, one of ordinary
skill in the art will immediately recognize that a large number of the nucleic acid
molecules having a sequence at least 90%, 95%, 97%, 98%, or 99% identical to the
nucleic acid sequence of the deposited DNA or the nucleic acid sequence shown in
Figure 1 (SEQ ID NO:1) will encode a polypeptide "having pyruvate carboxylase
protein activity." In fact, since degenerate variants of these nucleotide sequences all
encode the same polypeptide, this will be clear to the skilled artisan even without
performing the above described comparison assay. It will be further recognized in
the art that, for such nucleic acid molecules that are not degenerate variants, a
reasonable number will also encode a polypeptide having pyruvate carboxylase
protein activity. This is because the skilled artisan is fully aware of amino acid
substitutions that are either less likely or not likely to significantly affect protein
function (e.g., replacing one aliphatic amino acid with a second aliphatic amino
acid).

For example, guidance concerning how to make phenotypically silent
amino acid substitutions is provided in Bowie, J. U., et al., "Deciphering the
Message in Protein Sequences: Tolerance to Amino Acid Substitutions," Science
247:1306-1310 (1990), wherein the authors indicate that there are two main
approaches for studying the tolerance of an amino acid sequence to change. The
first method relies on the process of evolution, in which mutations are either
accepted or rejected by natural selection. The second approach uses genetic
engineering to introduce amino acid changes at specific positions of a cloned gene
and selections or screens to identify sequences that maintain functionality. As the
authors state, these studies have revealed that proteins are surprisingly tolerant of
amino acid substitutions. The authors further indicate which amino acid changes
are likely to be permissive at a certain position of the protein. For example, most
buried amino acid residues require nonpolar side chains, whereas few features of
surface side chains are generally conserved. Other such phenotypically silent
substitutions are described in Bowie, J.U., et al., supra, and the references cited
therein.

Vectors and Host Cells 



The present invention also relates to vectors which include the isolated
DNA molecules of the present invention, host cells which are genetically
engineered with the recombinant vectors, and the production of pyruvate
carboxylase polypeptides or portions thereof by recombinant techniques.

Recombinant constructs may be introduced into host cells using well
known techniques such as infection, transduction, transfection, transvection,
conjugation, electroporation and transformation. The vector may be, for example,
a phage, plasmid, viral or retroviral vector.

The polynucleotides may be joined to a vector containing a selectable
marker for propagation in a host. Generally, a plasmid vector is introduced in a
precipitate, such as a calcium phosphate precipitate, or in a complex with a charged
lipid. If the vector is a virus, it may be packaged in vitro using an appropriate
packaging cell line and then transduced into host cells.

Preferred are vectors comprising cis-acting control regions to the
polynucleotide of interest. Appropriate trans-acting factors may be supplied by the
host, supplied by a complementing vector or supplied by the vector itself upon
introduction into the host.

In certain preferred embodiments in this regard, the vectors provide for
specific expression, which may be inducible and/or cell type-specific. Particularly
preferred among such vectors are those inducible by environmental factors that are
easy to manipulate, such as temperature and nutrient additives.

Expression vectors useful in the present invention include chromosomal-,
episomal- and virus-derived vectors, e.g., vectors derived from bacterial plasmids,
bacteriophage, yeast episomes, yeast chromosomal elements, viruses such as
baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses,
pseudorabies viruses and retroviruses, and vectors derived from combinations
thereof, such as cosmids and phagemids.

The DNA insert should be operatively linked to an appropriate promoter,
such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the
SV40 early and late promoters and promoters of retroviral LTRs, to name a few.
Other suitable promoters will be known to the skilled artisan. The expression
constructs will further contain sites for transcription initiation, termination and, in
the transcribed region, a ribosome binding site for translation. The coding portion
of the mature transcripts expressed by the constructs will include a translation
initiating codon (AUG or GUG) at the beginning and a termination codon
appropriately positioned at the end of the polypeptide to be translated.

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase or neomycin
resistance for eukaryotic cell culture and tetracycline, ampicillin, chloramphenicol
or kanamycin resistance genes for culturing in E. coli and other bacteria.
Representative examples of appropriate hosts include bacterial cells, such as E. coli,
C. glutamicum, Streptomyces and Salmonella typhimurium cells; and fungal cells,
such as yeast cells. Appropriate culture media and conditions for the
above-described host cells are known in the art.

Vectors preferred for use in bacteria include pA2, pQE70, pQE60
and pQE-9, available from Qiagen; pBS vectors, Phagescript vectors, Bluescript
vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene; and
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia.
Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 and
pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available
from Pharmacia. Other suitable vectors will be readily apparent to the skilled
artisan.

Known bacterial promoters suitable for use in the present
invention include the E. coli lacI and lacZ promoters, the T3 and T7 promoters, the
gpt promoter, the lambda PR and PL promoters and the trp promoter. Suitable
eukaryotic promoters include the CMV immediate early promoter, the HSV
thymidine kinase promoter, the early and late SV40 promoters, the promoters of
retroviral LTRs, such as those of the Rous sarcoma virus (RSV), and
metallothionein promoters, such as the mouse metallothionein-I promoter.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic
lipid-mediated transfection, electroporation, transduction, infection or other
methods. Such methods are described in many standard laboratory manuals, such
as Davis et al., "Basic Methods in Molecular Biology," (1986).

Transcription of the DNA encoding the polypeptides of the present
invention by higher eukaryotes may be increased by inserting an enhancer sequence
into the vector. Enhancers are cis-acting elements of DNA, usually from about 10
to 300 bp, that act to increase transcriptional activity of a promoter in a given host
cell-type. Examples of enhancers include the SV40 enhancer, which is located on
the late side of the replication origin at bp 100 to 270, the cytomegalovirus early
promoter enhancer, the polyoma enhancer on the late side of the replication origin,
and adenovirus enhancers.

For secretion of the translated protein into the lumen of the endoplasmic
reticulum, into the periplasmic space or into the extracellular environment,
appropriate secretion signals may be incorporated into the expressed polypeptide.
The signals may be endogenous to the polypeptide or they may be heterologous
signals.

The polypeptide may be expressed in a modified form, such as a fusion
protein, and may include not only secretion signals, but also additional heterologous
functional regions. Thus, for instance, a region of additional amino acids,
particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence in the host cell, during purification,
or during subsequent handling and storage. Also, peptide moieties may be added
to the polypeptide to facilitate purification.

The pyruvate carboxylase protein can be recovered and purified from
recombinant cell cultures by well-known methods including ammonium sulfate or
ethanol precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography,
affinity chromatography, hydroxylapatite chromatography and lectin
chromatography. Most preferably, high performance liquid chromatography
("HPLC") is employed for purification.

Polypeptides of the present invention include naturally purified products,
products of chemical synthetic procedures, and products produced by recombinant
techniques from a prokaryotic or eukaryotic host, including, for example, bacterial,
yeast, higher plant, insect and mammalian cells. Depending upon the host employed
in a recombinant production procedure, the polypeptides of the present invention
may be glycosylated or may be non-glycosylated. In addition, polypeptides of the
invention may also include an initial modified methionine residue, in some cases as
a result of host-mediated processes.



Pyruvate Carboxylase Polypeptides and Peptides 

The invention further provides an isolated pyruvate carboxylase
polypeptide having the amino acid sequence encoded by the deposited DNA, or the
amino acid sequence in Figure 1 (SEQ ID NO:2), or a peptide or polypeptide
comprising a portion of the above polypeptides. The terms "peptide" and
"oligopeptide" are considered synonymous (as is commonly recognized) and each
term can be used interchangeably as the context requires to indicate a chain of at
least two amino acids coupled by peptidyl linkages. The word "polypeptide" is used
herein for chains containing more than ten amino acid residues. All oligopeptide
and polypeptide formulas or sequences herein are written from left to right and in
the direction from amino terminus to carboxy terminus.

It will be recognized in the art that some of the amino acid sequence of the
pyruvate carboxylase polypeptide can be varied without significant effect on the
structure or function of the protein. If such differences in sequence are
contemplated, it should be remembered that there will be critical areas on the
protein which determine activity. In general, it is possible to replace residues which
form the tertiary structure, provided that residues performing a similar function are
used. In other instances, the type of residue may be completely unimportant if the
alteration occurs at a non-critical region of the protein.

Thus, the invention further includes variations of the pyruvate
carboxylase polypeptide which show substantial activity or which include regions
of pyruvate carboxylase protein such as the protein portions discussed below. Such
mutants include deletions, insertions, inversions, repeats, and type substitutions (for
example, substituting one hydrophilic residue for another, but not strongly
hydrophilic for strongly hydrophobic as a rule). Small changes or such "neutral"
amino acid substitutions will generally have little effect on activity.

Typically seen as conservative substitutions are the replacements, one for
another, among the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the
hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu;
substitution between the amide residues Asn and Gln; exchange of the basic
residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
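
The groupings listed above can be written down as a small lookup. The following is only an illustrative sketch: the group membership is taken from the paragraph above, while the names CONSERVATIVE_GROUPS and is_conservative are assumptions introduced here and are not part of the disclosure.

```python
# Conservative-substitution groups as listed in the text (three-letter codes).
# CONSERVATIVE_GROUPS and is_conservative are hypothetical names for this sketch.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic
    {"Ser", "Thr"},                # hydroxyl
    {"Asp", "Glu"},                # acidic
    {"Asn", "Gln"},                # amide
    {"Lys", "Arg"},                # basic
    {"Phe", "Tyr"},                # aromatic
]


def is_conservative(residue_a, residue_b):
    """Return True if the residues are identical or fall in the same listed group."""
    if residue_a == residue_b:
        return True
    return any(residue_a in group and residue_b in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("Asp", "Glu"))  # acidic for acidic
    print(is_conservative("Leu", "Phe"))  # aliphatic for aromatic
```

A lookup of this kind only reflects the listed groups; whether a given substitution is tolerated in the actual protein still depends on the position at which it occurs.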

As indicated in detail above, further guidance concerning which amino
acid changes are likely to be phenotypically silent (i.e., are not likely to have a
significant deleterious effect on a function) can be found in Bowie, J.U., et al.,
"Deciphering the Message in Protein Sequences: Tolerance to Amino Acid
Substitutions," Science 247:1306-1310 (1990).

The polypeptides of the present invention are preferably provided in an
isolated form, and preferably are substantially purified. A recombinantly produced
version of the pyruvate carboxylase polypeptide can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).

The polypeptides of the present invention include the polypeptide
encoded by the deposited DNA, the polypeptide of SEQ ID NO:2, as well as
polypeptides which have at least 90% similarity, more preferably at least 95%
similarity, and still more preferably at least 97%, 98% or 99% similarity to those
described above. Further polypeptides of the present invention include polypeptides
at least 70% identical, more preferably at least 90% or 95% identical, still more
preferably at least 97%, 98% or 99% identical to the polypeptide encoded by the
deposited DNA, to the polypeptide of SEQ ID NO:2, and also include portions of
such polypeptides with at least 30 amino acids and more preferably at least 50
amino acids.

By "% similarity" for two polypeptides is intended a similarity score
produced by comparing the amino acid sequences of the two polypeptides using the
Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix,
Genetics Computer Group, University Research Park, 575 Science Drive, Madison,
WI 53711) and the default settings for determining similarity. Bestfit uses the local
homology algorithm of Smith and Waterman (Advances in Applied Mathematics
2:482-489, 1981) to find the best segment of similarity between two sequences.
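
For orientation, the local homology algorithm can be sketched in a few lines. This is a minimal illustration of Smith-Waterman scoring with a linear gap penalty, not a reimplementation of Bestfit: the function name and the scoring values (match +2, mismatch -1, gap -2) are assumptions chosen for the example and are not the Bestfit defaults.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Scoring values are illustrative assumptions, not Bestfit defaults.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # h[i][j] holds the best score of a local alignment ending at
    # seq_a[i-1] and seq_b[j-1]; the zero floor lets an alignment restart.
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            h[i][j] = max(
                0,
                h[i - 1][j - 1] + pair,  # align the two residues
                h[i - 1][j] + gap,       # gap in seq_b
                h[i][j - 1] + gap,       # gap in seq_a
            )
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    # Two of the conserved-region peptides listed in Table 1.
    print(smith_waterman_score("ATFDVAM", "ATYDVAL"))
```

Production tools score residue pairs with a substitution matrix and use separate gap-opening and gap-extension penalties; the single match/mismatch value here is kept only to make the recurrence easy to follow.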

By a polypeptide having an amino acid sequence at least, for example,
95% "identical" to a reference amino acid sequence of a pyruvate carboxylase
polypeptide is intended that the amino acid sequence of the polypeptide is identical
to the reference sequence except that the polypeptide sequence may include up to
five amino acid alterations per each 100 amino acids of the reference amino acid
sequence of the pyruvate carboxylase polypeptide. In other words, to obtain a
polypeptide having an amino acid sequence at least 95% identical to a reference amino
acid sequence, up to 5% of the amino acid residues in the reference sequence may be
deleted or substituted with another amino acid, or a number of amino acids up to
5% of the total amino acid residues in the reference sequence may be inserted into
the reference sequence. These alterations of the reference sequence may occur at
the amino or carboxy terminal positions of the reference amino acid sequence or
anywhere between those terminal positions, interspersed either individually among
residues in the reference sequence or in one or more contiguous groups within the
reference sequence.
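
The "alterations per 100 residues" rule translates directly into a count for a reference sequence of a given length. The sketch below is illustrative only; max_alterations is a hypothetical helper name, and the 1140-residue length used in the example is the length of the deduced protein reported in Example 5.

```python
def max_alterations(reference_length, percent_identity):
    """Largest whole number of altered residues that still meets the identity floor.

    Integer arithmetic is used so that no rounding error can creep in.
    """
    return reference_length * (100 - percent_identity) // 100


if __name__ == "__main__":
    for pct in (90, 95, 97, 98, 99):
        print(pct, max_alterations(1140, pct))
```

For a 100-residue reference at 95% identity this gives the five alterations stated in the text; for the 1140-residue protein at 95% identity it gives 57.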

As a practical matter, whether any particular polypeptide is at least 90%,
95%, 97%, 98% or 99% identical to, for instance, the amino acid sequence shown
in Figure 1 (SEQ ID NO:2) or to the amino acid sequence encoded by the deposited
cosmid clone can be determined conventionally using known computer programs
such as the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for
Unix, Genetics Computer Group, University Research Park, 575 Science Drive,
Madison, WI 53711). When using Bestfit or any other sequence alignment program
to determine whether a particular sequence is, for instance, 95% identical to a
reference sequence according to the present invention, the parameters are set, of
course, such that the percentage of identity is calculated over the full length of the
reference amino acid sequence and that gaps in homology of up to 5% of the total
number of amino acid residues in the reference sequence are allowed.



Genetic Tools for Manipulating Corynebacterium 

To make the genetic changes necessary for metabolic engineering in
Corynebacterium, researchers need to be able to identify and clone the genes that are
involved in the target pathway. They also need methods for altering these genes to affect
the regulation or level of expression of the enzymes they encode, and for subsequently
reintroducing the altered genes into Corynebacterium to monitor their effects on amino
acid biosynthesis. Therefore, metabolic engineers must have at their disposal an array of
plasmids that can replicate in both Corynebacterium and other, more easily manipulated
hosts, such as E. coli. Also required are a collection of selectable markers encoding, for
example, antibiotic resistance, well-characterized transcriptional promoters that permit
regulation of the altered genes, and efficient transformation or conjugation systems that
allow the plasmids to be inserted into the target Corynebacterium strain.

Plasmids. Several different plasmids have been isolated and developed for the
introduction and expression of genes in Corynebacterium (Sonnen, H., et al., Gene
107:69-74 (1991)). The majority of these were originally identified as small (3-5 kbp),
cryptic plasmids from C. glutamicum, C. callunae, and C. lactofermentum. They fall into
four compatibility groups, exemplified by the plasmids pCC1, pBL1, pHM1519, and
pGA1. Shuttle vectors, plasmids that are capable of replicating in both Corynebacterium
and E. coli, have been developed from these cryptic plasmids by incorporating elements
from known E. coli plasmids (particularly the ColE1 origin of replication from pBR322
or pUC18), as well as antibiotic-resistance markers. A fifth class of plasmids that is very
useful for manipulating Corynebacterium is based on pNG2, a plasmid originally isolated
from Corynebacterium diphtheriae (Serwold-Davis, T.M., et al., Proc. Natl. Acad. Sci.
USA 84:4964-4968 (1987)). This plasmid and its derivatives replicate efficiently in many
species of corynebacteria, as well as in E. coli. Since the sole origin of replication in
pNG2 (an element of only 1.8 kbp) functions in both the Gram-positive and
Gram-negative host, there is no need to add an additional ColE1-type element to it. As a
result, pNG2 derivatives (e.g., pEP2) are much smaller than other Corynebacterium
shuttle vectors.

Selectable Markers. Several genes conferring antibiotic resistance have proven
useful for plasmid selection and in other recombinant DNA work in corynebacteria.
These include the kanamycin resistance determinant from Tn903, a hygromycin
resistance marker isolated from Streptomyces hygroscopicus, a tetracycline resistance
gene from Streptococcus faecalis, a bleomycin resistance gene from Tn5, and a
chloramphenicol resistance marker from Streptomyces acrimycini. The β-lactamase gene
that is employed in many E. coli plasmids such as pBR322 does not confer ampicillin
resistance in Corynebacterium.

Transformation Systems. Several methods have been devised for introducing foreign
DNA into Corynebacterium. The earliest method to be employed routinely was based
on protocols that had been successful for other Gram-positive species, involving
incubation of spheroplasts in the presence of DNA and polyethylene glycol (Yoshihama,
M., et al., J. Bacteriol. 162:591-597 (1985)). While useful, these methods were generally
inefficient, often yielding fewer than 10^5 transformants per milligram of DNA.
Electroporation of Corynebacterium spheroplasts has proven to be a much more efficient
and reliable means of transformation. Spheroplasts are generated by growing the cells
in rich media containing glycine and/or low concentrations of other inhibitors of cell wall
biosynthesis, such as isonicotinic acid hydrazide (isoniazid), ampicillin, penicillin G, or
Tween-80. The spheroplasts are then washed in low-salt buffers containing glycerol,
concentrated, and mixed with DNA before being subjected to electroporation.
Efficiencies as high as 10^7 transformants per microgram of plasmid DNA have been
reported with this protocol.

A third method for DNA transfer into corynebacteria involves
transconjugation. This method takes advantage of the promiscuity of E. coli strains
carrying derivatives of the plasmid RP4. In E. coli, RP4 encodes many functions that
mediate the conjugal transfer of plasmids from the host strain to other recipient strains
of E. coli, or even to other species. These "tra functions" mediate pilus formation and
plasmid transfer. RP4 also carries an origin of transfer, oriT, a cis-acting element that is
recognized by the transfer apparatus that allows the plasmid to be conducted through the
pilus and into the recipient strain. From this system Simon et al. (Bio/Technology
1:784-791 (1985)) have developed a useful transconjugation tool that allows the transfer
of plasmids from E. coli to Corynebacterium. They relocated the tra functions from RP4
into the E. coli chromosome in a strain called S17-1. Plasmids carrying the RP4 oriT can
be mobilized from S17-1 into other recipients very efficiently. Although this method has
proven useful for introducing replicating plasmids into Corynebacterium, it has proven
even more useful for generating gene disruptions. This is accomplished by introducing
a selectable marker into a clone of the Corynebacterium gene that is targeted for
disruption. This construct is then ligated into an E. coli plasmid that carries the RP4 oriT
but lacks an origin to support replication in Corynebacterium. S17-1 carrying this
plasmid is then incubated with the recipient strain and the mixture is later transferred to
a selective medium. Because the plasmid that was introduced is unable to replicate in
corynebacteria, transconjugants that express the selectable marker are most likely to have
undergone a cross-over recombination within the genomic DNA.

Restriction-Deficient Strains. Regardless of the transformation system used, there
is clear precedent in the literature that corynebacteria are able to recognize E. coli-derived
DNA as foreign and will most often degrade it. This ability has been attributed to the
Corynebacterium restriction and modification system. To overcome this system, some
transformation and transconjugation protocols call for briefly heating the recipient strain
prior to transformation. The heat treatment presumably inactivates the enzymes
responsible for the restriction system, allowing the introduced DNA to become
established before the enzymes are turned over. Another strategy for improving the
efficiency of DNA transfer has been to isolate Corynebacterium mutants that are deficient
in the restriction system. These strains will incorporate plasmids that had been
propagated in E. coli with almost the same efficiency as plasmids that had been
propagated in Corynebacterium. In an alternate strategy used to circumvent the
restriction system in Corynebacterium, Leblon and coworkers (Reyes, O., et al., Gene
107:61-68 (1991)) developed an "integron" system for gene disruption. Integrons are
DNA molecules that have the same restriction/modification properties as the target host's
DNA, carry DNA that is homologous to a portion of the host genome (i.e., a region of the
genome that is to be disrupted), and are unable to replicate in the host cell. A cloned gene
from Corynebacterium is first interrupted with a selectable marker in a plasmid that is
propagated in one Corynebacterium strain. This construct is then excised from the
corynebacterial plasmid and self-ligated to form a non-replicating circular molecule. This
"integron" is then electroporated into the restrictive host. Modification of the DNA
allows the integron to elude the host restriction system, and recombination into the host
genome permits expression of the selectable marker.



Promoters. Reliable transcriptional promoters are required for efficient expression
of foreign genes in Corynebacterium. For certain experiments, there is also a need for
regulated promoters whose activity can be induced under specific culture conditions.
Promoters such as the fda, thrC, and hom promoters derived from Corynebacterium
genes have proven useful for heterologous gene expression. Inducible promoters from
E. coli, such as Plac and Ptrc, which are induced by isopropylthiogalactopyranoside
(IPTG) when the lac repressor (lacI) is present; Ptrp, which responds to the inducer indole
acrylic acid when the trp repressor (trpR) is present; and lambda PL, which is repressed
in the presence of the temperature-sensitive lambda repressor (cI857), have all been used
to modulate gene expression in Corynebacterium.



Gene Identification. With all other genetic tools in place, there still remains the
challenge of identifying relevant genes from Corynebacterium. In E. coli, some of the
resources that have been used to isolate genes are transducing phage, transposable
elements, genetic maps of the E. coli chromosome from transduction and
transconjugation experiments, and more recently, complete physical and sequence maps
of the chromosome. To date, the most successful method for identifying and recovering
genes from Corynebacterium has been to use Corynebacterium genomic DNA to
complement known auxotrophs of E. coli. In this exercise, libraries of plasmids carrying
fragments of the Corynebacterium genome are introduced into E. coli strains that are
deficient in a particular enzyme or function. Transformants that no longer display the
auxotrophy (e.g., homoserine deficiency) are likely to carry the complementing gene from
Corynebacterium. This strategy has led to isolation of numerous Corynebacterium genes,
including several from the pathways responsible for synthesis of aspartate-derived and
aromatic amino acids, intermediary metabolism, and other cellular processes. One
limitation to this strategy is that not all genes from Corynebacterium will be expressed
in the E. coli host. Thus, although a gene may be represented in the plasmid library, it
may be unable to complement the E. coli mutation and therefore would not be recovered
during selection. Overcoming this limitation, a smaller number of genes have been
identified with a similar strategy in which a plasmid library from wild-type
Corynebacterium was used to directly complement mutations in other Corynebacterium
strains. Although this strategy avoids the concern of insufficient gene expression in the
auxotrophic host, its utility is limited by poor plasmid-transformation efficiency in the
auxotrophs. Still other genes have been identified by hybridization with nucleic acid
probes based upon homologous genes from other species, and by direct amplification of
genes using the polymerase chain reaction and degenerate oligonucleotide primers.



Transposable Elements. Transposable elements are extremely powerful tools in gene
identification because they couple mutagenesis with gene recovery. Unlike classical
mutagenesis techniques, which generate point mutations or small deletions within a gene,
when transposable elements insert within a gene they form large disruptions, thereby
"tagging" the altered gene for easier identification. A number of transposable elements
have been found to transpose in Corynebacterium. Transposons found in the plasmids
pTP10 of C. xerosis and pNG2 of C. diphtheriae have been shown to transpose in C.
glutamicum and confer resistance to erythromycin. A group from the Mitsubishi
Chemical Company in Japan developed a series of artificial transposons from an insertion
sequence, IS31831, that they discovered in C. glutamicum (Vertes, A.A., et al., Mol. Gen.
Genet. 245:397-405 (1994)). After inserting a selectable marker between the inverted
repeats of IS31831, these researchers were able to introduce the resulting transposon into
C. glutamicum strains on an E. coli plasmid (unable to replicate in Corynebacterium) via
electroporation. They found that the selectable marker had inserted into the genome of
the target cell at a frequency of approximately 4 x 10^4 mutants/µg DNA. The use of such
transposons to generate Corynebacterium auxotrophs has led to the isolation of several
genes responsible for amino acid biosynthesis, as well as other functions in
corynebacteria.



Transducing Phage. Transducing phage have been used in other systems for mapping
genetic loci and for isolating genes. In 1976, researchers at Ajinomoto Co. in Japan
surveyed 150 characterized and uncharacterized strains of glutamic acid-producing
coryneform bacteria to identify phage that might be useful for transduction
(Momose, H., et al., J. Gen. Appl. Microbiol. Rev. 76:243-252 (1995)). Of 24 different
phage isolates recovered from this screen, only three were able to transduce a trp marker
from a trp+ donor to a trp- recipient with any appreciable frequency, although even this
efficiency was only 10^-7 or less. These researchers were able to improve transduction
efficiency slightly by including 4 mM cyclic adenosine monophosphate (cAMP) or 1.2
M magnesium chloride. Several different researchers have attempted to develop reliable
transduction methods by isolating corynephages from sources such as contaminated
industrial fermentations, soil and animal waste. Although many phage have been
isolated and characterized, few have been associated with transduction, and an
opportunity still exists to develop a reliable, high-efficiency transduction system for
general use with the glutamic acid-producing bacteria.


EXAMPLES 

The following protocols and experimental details are referenced in the examples that 
follow. 


Bacterial strains and plasmids 

C. glutamicum 21253 (hom-, lysine overproducer) was used for the preparation of
chromosomal DNA. Escherichia coli DH5α (hsdR-, recA-) (Hanahan, D., J. Mol. Biol.
166:557-580 (1983)) was used for transformations. Plasmid pCR2.1 TOPO (Invitrogen)
was used for cloning polymerase chain reaction (PCR) products. The plasmid pRR850
was constructed in this study and contained an 850-bp PCR fragment cloned in the
pCR2.1 TOPO plasmid.

Media and culture conditions



E. coli strains were grown in Luria-Bertani (LB) medium at 37 °C (Sambrook, J.,
et al., Molecular cloning: a laboratory manual, 2nd edn., Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1989)). C. glutamicum was grown in LB medium
at 30 °C. Where noted, ampicillin was used at the following concentrations: 100 µg/ml
in plates and 50 µg/ml in liquid culture.

DNA manipulations 

Genomic DNA was isolated from C. glutamicum as described by Tomioka et al.
(Tomioka, N., et al., Mol. Gen. Genet. 184:359-363 (1981)). PCR fragments were cloned
into the pCR2.1 TOPO vector following the manufacturer's instructions. Cosmid and
plasmid DNA were prepared using Qiaprep spin columns and DNA was extracted from
agarose gels with the Qiaex kit (Qiagen). For large-scale high-purity preparation of
cosmid DNA for sequencing, the Promega Wizard kit was used (Promega). Standard
techniques were used for transformation of E. coli and agarose gel electrophoresis
(Sambrook, J., et al., Molecular cloning: a laboratory manual, 2nd edn., Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY (1989)). Restriction enzymes were
purchased from Boehringer Mannheim or New England Biolabs.

Cosmid library 

The cosmid library used was constructed by cloning C. glutamicum chromosomal
DNA into the Supercos vector (Stratagene).

Polymerase Chain Reaction (PCR) 

PCR was performed using the Boehringer Mannheim PCR core kit following the
manufacturer's instructions. When PCR was performed on Corynebacterium
chromosomal DNA, about 1 µg of DNA was used in each reaction. The forward primer
used was 5'-GTCTTCATCGAGATGAATCCGCG-3' and the reverse primer used was
5'-CGCAGCGCCACATCGTAAGTCGC-3'.
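
As a quick sanity check on the two primers, their length and GC content can be computed directly from the sequences given above. This is only an illustrative sketch; the helper name gc_fraction is an assumption introduced here, and no melting-temperature model is implied.

```python
# Primer sequences as given in the text (5' to 3').
FORWARD = "GTCTTCATCGAGATGAATCCGCG"
REVERSE = "CGCAGCGCCACATCGTAAGTCGC"


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}")
```

Both primers are 23 nucleotides long; the forward primer contains 12 G/C bases and the reverse primer 15.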



Dot-blot analysis 



Dot blots containing DNA from cosmids identified in this study and the probe as
a positive control were prepared using the S&S (Schleicher & Schuell) minifold apparatus.
An 850-bp fragment encoding a portion of the C. glutamicum pyruvate carboxylase gene
was used as the probe. The probe was labeled with digoxigenin-11-dUTP (Boehringer
Mannheim) in a randomly primed DNA-labeling reaction as described by the
manufacturer. Hybridization, washing and colorimetric detection of the dot blots were
done with the Genius system from Boehringer following the protocols in their user's
guide for filter hybridization. The initial hybridization with the 291 cosmids was carried
out at 65 °C overnight and washes were performed at the hybridization temperature. For
the 17 cosmids that were used in the second screen, the hybridization was carried out at
65 °C, but for only 8 h, and the time of exposure to the film was decreased.



Detection of biotin-containing proteins by Western blotting 

Cell extracts from C. glutamicum were prepared as described by Jetten and Sinskey
(Jetten, M.S.M., & Sinskey, A.J., FEMS Microbiol. Lett. 111:183-188 (1993)). Proteins
in cell extracts were separated in sodium dodecyl sulfate (SDS)/7.5% polyacrylamide gels
in a BioRad mini gel apparatus and were electroblotted onto nitrocellulose, using the
BioRad mini transblot apparatus, as described by Towbin et al. (Towbin, H., et al., Proc.
Natl. Acad. Sci. USA 76:4350-4354 (1979)). Biotinylated proteins were detected using
avidin-conjugated alkaline phosphatase from BioRad and
5-bromo-4-chloro-3-indolylphosphate p-toluidine salt/nitroblue tetrazolium chloride
from Schleicher & Schuell.



DNA sequencing 






Automated DNA sequencing was performed by the MIT Biopolymers facility 
employing an ABI Prism 377 DNA sequencer. 

Sequence analysis

The program DNA Strider Version 1.0 (Institut de Recherche Fondamentale,
France) was used to invert, complement and translate the DNA sequence, and find open
reading frames in the sequence. The BLAST program (Altschul, S.F., et al., J. Mol. Biol.
215:403-410 (1990)) from the National Center for Biotechnology Information (NCBI)
was employed to compare protein and DNA sequences. Homology searches in proteins
were done using the MACAW software (NCBI). PCR primers were designed with the
aid of the Primer Premier software from Biosoft International. The compute pI/MW tool
on the ExPASy molecular biology server (University of Geneva) was used to predict the
molecular mass and pI of the deduced amino acid sequence.

Example 1: Western blotting to detect biotinylated enzymes

Since pyruvate carboxylase is known to contain biotin, Western blotting was used
to detect the production of biotinylated proteins by C. glutamicum. Two biotinylated
proteins were detected in extracts prepared from cells grown in LB medium (data not
shown), consistent with previous reports. One band, located at approximately 80 kDa, has
been identified as the biotin-carboxyl-carrier domain (BCCP) of the acetyl-CoA
carboxylase (Jäger, W., et al., Arch. Microbiol. 166:76-82 (1996)). The second band, at
120 kDa, is believed to be the pyruvate carboxylase enzyme, as these proteins are in the
range 113-130 kDa (Attwood, P.V., Int. J. Biochem. Cell Biol. 27:231-249 (1995)).

Example 2: PCR and cloning

The C. glutamicum pyruvate carboxylase gene was cloned on the basis of the homology
of highly conserved regions in previously cloned genes. Pyruvate carboxylase genes from
thirteen organisms were examined and primers corresponding to an ATP-binding
submotif conserved in pyruvate carboxylases and the region close to the pyruvate-binding
motif (Table 1) were designed. Where the amino acids were different, the primers were
designed on the basis of M. tuberculosis because of its close relationship to C.
glutamicum. An 850-bp fragment was amplified from C. glutamicum genomic DNA
using the PCR and cloned in the pCR2.1 TOPO vector of Invitrogen to construct plasmid
pRR850. Primers were also designed based on the conserved biotin-binding site and
pyruvate-binding site (data not shown).

Example 3: Isolating a cosmid containing the C. glutamicum pyruvate carboxylase
gene

The 850-base-pair fragment containing a portion of the C. glutamicum pyruvate
carboxylase gene was used to probe a C. glutamicum genomic library. In the first round
of screening, 17 out of 291 cosmids in a dot blot appeared positive. A second round of
screening was performed on these 17 cosmids, using the same probe but more stringent
hybridization conditions, yielding four cosmids with a positive signal. To confirm that
these cosmids indeed contained the pyruvate carboxylase gene, PCR was performed using
the four positive cosmids as templates and the same primers used to make the probe. An
850-bp fragment was amplified from all four positive cosmids, designated IIIF10, IIE9,
IIIG7 and IIIB7.



Organism                        Conserved region A    Conserved region B

Caenorhabditis elegans          YFIEVNAR              ATFDVSM
Aedes aegypti                   YFIEVNAR              ATFDVAL
Mycobacterium tuberculosis      VFIEMNPR              ATYDVAL
Bacillus stearothermophilus     YFIEVNPR              ATFDVAY
Pichia pastoris                 YFIEINPR              ATFDVSM
Mus musculus                    YFIEVMSR              ATFDVAM
Rattus norvegicus               YFIEVNSR              ATFDVAM
Saccharomyces cerevisiae 1      YFIEINPR              ATFDVAM
Saccharomyces cerevisiae 2      YFIEINPR              ATFDVAM
Rhizobium etli                  YFIEVNPR              ATFDVSM
Homo sapiens                    YFIEVNSR              ATFDVAM
Schizosaccharomyces pombe       YFIEINPR              ATFDVSM



Table 1. Pyruvate carboxylase sequences from 13 organisms (obtained from GenBank)
were aligned using the MACAW software. Two highly conserved regions were selected
and oligonucleotide primers were designed on the basis of the Mycobacterium tuberculosis
DNA sequence corresponding to these regions. The forward primer was based on the
DNA sequence corresponding to conserved region A and the reverse primer was based on
the DNA sequence corresponding to conserved region B.
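
The relationship between the primers and the conserved regions can be checked by translating the primer sequences. The sketch below is illustrative and uses the standard genetic code; the helper names (CODON_TABLE, reverse_complement, translate) are assumptions introduced here. The primer sequences are those given under "Polymerase Chain Reaction (PCR)".

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FORWARD = "GTCTTCATCGAGATGAATCCGCG"
REVERSE = "CGCAGCGCCACATCGTAAGTCGC"

if __name__ == "__main__":
    print(translate(FORWARD))                      # VFIEMNP
    print(translate(reverse_complement(REVERSE)))  # ATYDVAL
```

The forward primer reads in frame as VFIEMNP, the start of the M. tuberculosis region A sequence VFIEMNPR, and the reverse complement of the reverse primer reads as ATYDVAL, the M. tuberculosis region B sequence, in line with the primer design described in the caption.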

The 850-bp insert of plasmid pRR850 was sequenced using the M13 forward and
M13 reverse primers. On the basis of this sequence, primers Begrev1 and Endfor1 were
designed and used to sequence outwards from the beginning and the end of the 850-bp
portion of the pyruvate carboxylase gene. Cosmid IIIF10 was used as the sequencing
template. The sequencing was continued by designing new primers (Table 2) and
"walking" across the gene.

Example 5: Sequence analysis

3637 bp of cosmid III F10 were sequenced. A 3420-bp open reading frame was
identified, which is predicted to encode a protein of 1140 amino acids. The deduced
protein is 63% identical to M. tuberculosis pyruvate carboxylase and 44% identical to
human pyruvate carboxylase, and the C. glutamicum gene was named pyc on the basis of
this homology. The deduced protein has a predicted pI of 5.4 and molecular mass of
123.6 kDa, which is similar to the subunit molecular mass of 120 kDa estimated by
SDS/polyacrylamide gel electrophoresis. Upstream of the starting methionine there
appears to be a consensus ribosome-binding site, AAGGAA. The predicted translational
start site, based on homology to the M. tuberculosis sequence, is a GTG codon, as has
been observed in other bacterial sequences (Stryer, L., Biochemistry, 3rd edn., Freeman,
NY (1988); Keilhauer, C., et al., J. Bacteriol. 175:5595-5603 (1993)). The DNA
sequence has been submitted to GenBank and has been assigned the accession number
AF038548.
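
The relationship between the open reading frame length and the protein size can be checked with simple arithmetic. The sketch below assumes that the 3420-bp figure excludes the stop codon and uses a rule-of-thumb average residue mass of 110 Da. That average is an assumption for illustration, not a value from this disclosure.

```python
ORF_BP = 3420            # open reading frame length stated in the text
CODON_BP = 3             # nucleotides per codon
AVG_RESIDUE_DA = 110.0   # assumed rule-of-thumb average mass per residue

# The ORF must contain a whole number of codons.
assert ORF_BP % CODON_BP == 0

residues = ORF_BP // CODON_BP
approx_mass_kda = residues * AVG_RESIDUE_DA / 1000.0

print(f"{residues} residues, roughly {approx_mass_kda:.1f} kDa")
```

The rough estimate is close to both the predicted mass (123.6 kDa) and the subunit mass estimated by gel electrophoresis (120 kDa).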

The amino-terminal segment of the C. glutamicum pyruvate carboxylase contains
the hexapeptide GGGGRG, which matches the GGGG(R/K)G sequence that is found in
all biotin-binding proteins and is believed to be an ATP-binding site (Fry, D.C., et al.,
Proc. Natl. Acad. Sci. USA 83:907-911 (1986); Post, L.E., et al., J. Biol. Chem.
265:7742-7747 (1990)). A second region that is proposed to be involved in ATP binding
and is present in biotin-dependent carboxylases and carbamylphosphate synthetase (Lim,
F., et al., J. Biol. Chem. 263:11493-11497 (1988)) is conserved in the C. glutamicum
sequence. The predicted C. glutamicum pyruvate carboxylase protein also contains a
putative pyruvate-binding motif, FLFEDPWDR, which is conserved in the
transcarboxylase domains of Mycobacterium, Rhizobium and human pyruvate
carboxylases (Dunn, M.F., et al., J. Bacteriol. 178:5960-5970 (1996)). Tryptophan
fluorescence studies with transcarboxylase have shown that the Trp residue present in this
motif is involved in pyruvate binding (Kumar, G.K., et al., Biochemistry 27:5978-5983
(1988)). The carboxy-terminal segment of the enzyme contains a putative biotin-binding
site, AMKM, which is identical to those found in other pyruvate carboxylases as well as
the biotin-carboxyl-carrier protein (BCCP) domains of other biotin-dependent enzymes.
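
The three motifs named above can be located with a simple pattern search. The following sketch uses three short fragments of the deduced protein copied from the translation in FIG. 1. The fragment labels and the helper `find_motifs` are hypothetical names introduced only for this illustration.

```python
import re

# Short stretches of the deduced protein, copied from the FIG. 1 translation.
FRAGMENTS = {
    "N-terminal": "AEGQTYPIFVKAVAGGGGRGMRFVASPDEL",
    "central": "WGGATYDVAMRFLFEDPWDRLDELREAMPN",
    "C-terminal": "AEGDEVKAGDAVAIIEAMKMEATITASVDG",
}

# Motifs as described in the text; (R/K) is written as a character class.
MOTIFS = {
    "ATP-binding": r"GGGG[RK]G",
    "pyruvate-binding": r"FLFEDPWDR",
    "biotin-binding": r"AMKM",
}


def find_motifs(fragments, motifs):
    """Return {motif name: (fragment name, matched text)} for the first hit."""
    hits = {}
    for motif_name, pattern in motifs.items():
        for fragment_name, sequence in fragments.items():
            match = re.search(pattern, sequence)
            if match:
                hits[motif_name] = (fragment_name, match.group())
                break
    return hits


hits = find_motifs(FRAGMENTS, MOTIFS)
for name, (fragment, matched) in hits.items():
    print(f"{name}: {matched} found in the {fragment} fragment")
```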



Primer name    Primer sequence (5'-3')
Begrev1        TTCACCAGGTCCACCTCG
Endfor1        CGTCGCAAAGCTGACTCC
Begrev2        GATGCTTCTGTTGCTAATTTGC
Endfor2        GGCCATTAAGGATATGGCTG
Begrev3        GCGGTGGAATGATCCCCGA
Endfor3        ACCGCACTGGGCCTTGCG
Endfor4        TCGCCGCTTCGGCAACAC

Table 2. DNA sequences of the primers used to obtain the sequence of the pyruvate
carboxylase gene in the cosmid III F10.
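
The orientation of the sequencing primers can be checked against the gene sequence. The minimal sketch below takes two 45-nt rows copied from the FIG. 1 sequence (the rows labeled 1120 and 1885) and tests whether a primer matches the row directly or as its reverse complement. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_primer(primer, template):
    """Return (orientation, 0-based index within template), or None."""
    index = template.find(primer)
    if index != -1:
        return ("forward", index)
    index = template.find(reverse_complement(primer))
    if index != -1:
        return ("reverse", index)
    return None


# Rows copied from FIG. 1 (labeled 1120 and 1885 in the figure).
ROW_1120 = "ACCGTGACTGAAGAAGTCACCGAGGTGGACCTGGTGAAGGCGCAG"
ROW_1885 = "GAGGCCGTCGCAAAGCTGACTCCTGAGCTTTTGTCCGTGGAGGCC"

begrev1 = locate_primer("TTCACCAGGTCCACCTCG", ROW_1120)
endfor1 = locate_primer("CGTCGCAAAGCTGACTCC", ROW_1885)
print("Begrev1:", begrev1)
print("Endfor1:", endfor1)
```

In this check Begrev1 anneals as a reverse primer and Endfor1 as a forward primer, which agrees with their use to sequence outwards from the beginning and the end of the 850-bp fragment.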

Previous studies have shown that phosphoenolpyruvate carboxylase (ppc) is not the
main anaplerotic enzyme for C. glutamicum, since its absence does not affect lysine
production (Gubler, M., et al., Appl. Microbiol. Biotechnol. 40:857-863 (1994);
Peters-Wendisch, P.G., et al., Microbiol. Lett. 112:269-274 (1993)). Moreover, a number of
studies have indicated the presence of a pyruvate-carboxylating enzyme, employing
13C-labeling experiments with NMR and GC-MS analysis (Park, S.M., et al., Applied
Microbiol. Biotechnol. 47:430-440 (1997b); Peters-Wendisch, P.G., et al., Arch.
Microbiol. 165:387-396 (1996)), or enzymatic assays with cell-free extracts (Tosaka, O.,
Agric. Biol. Chem. 43:1513-1519 (1979)) and permeable cells (Peters-Wendisch, P.G.,
et al., Microbiol. 143:1095-1103 (1997)). Very low pyruvate carboxylation activity was
detected in cell-free extracts, but this activity was not uncoupled from a very high ATP
background. It is highly probable that the activity measured is due to reversible
gluconeogenic enzymes, such as oxaloacetate decarboxylase and malic enzyme. The
presence of pyruvate carboxylase in C. glutamicum makes it highly unlikely that the
gluconeogenic enzymes mentioned above can serve the anaplerotic needs of this strain.

The deduced amino acid sequence of the C. glutamicum pyruvate carboxylase gene
has significant similarity to the pyruvate carboxylase sequences from a diverse group of
organisms. It contains a biotin carboxylase domain in its N-terminal region, a BCCP
domain in its C-terminal region, and a transcarboxylase domain with a binding site
specific for pyruvate in its central region. The C. glutamicum pyruvate carboxylase
protein showed strong homology to the M. tuberculosis and human pyruvate carboxylases
(Wexler, I.D., et al., Biochim. Biophys. Acta 1227:46-52 (1994)).

There are precedents for the finding that C. glutamicum contains more than one enzyme
to perform the anaplerotic function of regenerating oxaloacetate. Pseudomonas
citronellolis, Pseudomonas fluorescens, Azotobacter vinelandii and Thiobacillus novellus
contain both ppc and pyruvate carboxylase (O'Brien, R.W., et al., J. Biol. Chem.
252:1257-1263 (1977); Scrutton, M.C., & Taylor, B.L., Arch. Biochem. Biophys.
164:641-654 (1974); Milrad de Forchetti, S.R., & Cazzulo, J.J., J. Gen. Microbiol.
93:75-81 (1976); Charles, A.M., & Willer, D.W., Can. J. Microbiol. 30:532-539 (1984)). Zea
mays contains three isozymes of ppc (Toh, H., et al., Plant Cell Environ. 17:31-43
(1994)) and Saccharomyces cerevisiae contains two isozymes of pyruvate carboxylase
(Brewster, N.K., et al., Arch. Biochem. Biophys. 311:62-71 (1994)), each differentially
regulated. With the present discovery of the existence of a pyruvate carboxylase gene in
C. glutamicum, the number of enzymes that can interconvert phosphoenolpyruvate (PEP),
oxaloacetate and pyruvate in this strain rises to six. The presence of all six enzymes in
one organism has not been reported previously. P. citronellolis contains a set of five
enzymes that interconvert oxaloacetate, PEP and pyruvate, namely pyruvate kinase, PEP
synthetase, PEP carboxylase, oxaloacetate decarboxylase and pyruvate carboxylase
(O'Brien, R.W., et al., J. Biol. Chem. 252:1257-1263 (1977)). Azotobacter contains all
of the above enzymes except PEP synthetase (Scrutton, M.C., & Taylor, B.L., Arch.
Biochem. Biophys. 164:641-654 (1974)).

The presence in C. glutamicum of the six metabolically related enzymes suggests
that the regulation of these enzymes through effectors is important. Biochemical and
genetic study of all six enzymes in coordination with other downstream activities may
lead to the elucidation of the exact procedures necessary for maximizing the production
of primary metabolites by this industrially important organism.

Example 6: Construction of a pyruvate carboxylase mutant

The entire reading frame from nucleotide 180 to nucleotide 3630 of the pyruvate
carboxylase DNA was amplified using PCR. The oligonucleotide primers used for the
PCR were designed to remove the SalI site within the coding sequence by silent
mutagenesis and introduce EcoRV and SalI sites upstream and downstream, respectively,
of the open reading frame. The PCR product was digested with EcoRV and SalI and
cloned into the vector pBluescript. The resulting plasmid is pPCBluescript. To obtain
a plasmid-borne disruption of pyc, a derivative of pPCBluescript was constructed in
which the middle portion of the pyc gene was deleted and replaced with the tsr gene,
which encodes resistance to the antibiotic thiostrepton. The RP4 mob element was then
inserted into the plasmid, yielding pAL240. This plasmid can be conjugally transferred
into Corynebacterium, but it is then unable to replicate because it has only a ColE1 origin
of replication. pAL240 was transferred from E. coli S17-1 into C. glutamicum via
transconjugation, and transconjugants were selected on medium containing thiostrepton
and nalidixic acid.
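
The removal of the internal SalI site by silent mutagenesis can be illustrated on the 5' end of the open reading frame shown in FIG. 1. The sketch below enumerates single-base changes that destroy the SalI recognition sequence (GTCGAC) without changing the encoded amino acid. The specific base actually changed is not stated in this disclosure, so the candidates printed are hypothetical. The start codon is left untouched by assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SALI_SITE = "GTCGAC"
# First seven codons of the ORF, copied from FIG. 1 (GTG start codon).
ORF_5P = "GTGTCGACTCACACATCTTCA"


def silent_site_killers(orf, site, skip_codons=1):
    """List (0-based position, old base, new base) single-base changes that
    remove `site` from `orf` while keeping the affected codon synonymous.
    The first `skip_codons` codons (here the start codon) are not altered."""
    start = orf.find(site)
    if start == -1:
        return []
    candidates = []
    for pos in range(start, start + len(site)):
        codon_start = pos - pos % 3
        if codon_start // 3 < skip_codons:
            continue
        codon = orf[codon_start:codon_start + 3]
        for base in "ACGT":
            if base == orf[pos]:
                continue
            mutated = orf[:pos] + base + orf[pos + 1:]
            new_codon = mutated[codon_start:codon_start + 3]
            if CODON_TABLE[new_codon] == CODON_TABLE[codon] and site not in mutated:
                candidates.append((pos, orf[pos], base))
    return candidates


candidates = silent_site_killers(ORF_5P, SALI_SITE)
print(candidates)
```

Under these assumptions, only the third base of the serine codon TCG can be changed silently, which is consistent with a one-base silent mutation.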

After the drug resistance phenotype of each transconjugant was confirmed, the
transconjugants were tested for their ability to grow on different carbon sources. Because
pAL240 cannot replicate in C. glutamicum, the only cells which will survive should be
those whose genomes have undergone recombination with the plasmid. Several
candidates were identified with the proper set of phenotypes: they are resistant to
thiostrepton and nalidixic acid, grow well on minimal plates containing glucose or acetate
as the sole carbon source, and grow poorly or not at all on minimal plates containing
lactate as the sole carbon source. Southern hybridization and PCR-based assays are used
to confirm that there is only one copy of the pyruvate carboxylase gene in the
genome and that it is disrupted with the thiostrepton resistance marker. Lysine
production and the production of biotinylated proteins by this strain are examined, and the
Δpyc strain is used as a negative control in activity assays and as a host strain for
complementation tests.

Example 7: Development of an overexpressing strain

In order to test the hypothesis that increased levels of pyruvate carboxylase
will lead to increased production of lysine, it is necessary to construct strains in which
expression of the pyruvate carboxylase gene is under the control of an inducible
promoter.

The vector pAPE12, which has the NG2 origin of replication and a multiple
cloning site downstream of the IPTG-controlled trc promoter, was used as an expression
vector in C. glutamicum. A derivative of pAPE12 was constructed which contained the
pyruvate carboxylase gene downstream of Ptrc. The pyc gene was excised from
pPCBluescript using SalI and XbaI and ligated into pAPE12 which had been cleaved with
the same enzymes, forming pLW305. The pyruvate carboxylase gene present in
pPCBluescript (and hence in pLW305) has the wild-type GTG start codon, and the SalI
restriction site present near the 5' end of the wild-type gene was eliminated by the
introduction of a one-base silent mutation during amplification of the pyruvate
carboxylase gene.

pLW305 and pAPE12 were electroporated into several other
Corynebacterium genetic backgrounds.

Because the pyruvate carboxylase gene in pLW305 has a GTG start codon
and carries some intervening DNA between the trc promoter and the start codon, a
pyruvate carboxylase overexpression plasmid, pXL1, was designed that eliminates those
shortcomings. The 5' end of the gene was amplified from pLW305 with oligonucleotide
primers that simultaneously change the GTG start codon to ATG and introduce a
BspLU11I restriction site, which is compatible with NcoI. The PCR product was then
cut with BspLU11I and AfeI, and ligated into the 7.5 kb backbone obtained by partial
digest of pLW305 with NcoI followed by complete cutting with AfeI. Two independent
sets of ligations and transformations have yielded putative pXL1 clones.
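
The compatibility of BspLU11I and NcoI ends, and the way the new ATG start codon fits the BspLU11I site, can be sketched as follows. The recognition sequences and cut positions used (A^CATGT and C^CATGG) are the commonly documented ones and are stated here as assumptions, since they are not given in this disclosure. The two upstream bases added by the primer are likewise hypothetical.

```python
# (recognition site, cut offset on each strand); assumed values, see lead-in.
ENZYMES = {
    "BspLU11I": ("ACATGT", 1),
    "NcoI": ("CCATGG", 1),
}


def five_prime_overhang(site, cut):
    """Single-stranded 5' overhang left when a palindromic site is cut
    `cut` bases in from the 5' end on each strand."""
    return site[cut:len(site) - cut]


overhangs = {name: five_prime_overhang(site, cut)
             for name, (site, cut) in ENZYMES.items()}

# Start codon plus the first base of the second codon, from FIG. 1.
ORF_START_WT = "GTGT"
engineered = "ATG" + ORF_START_WT[3:]          # GTG -> ATG

# Hypothetical upstream bases supplied by the PCR primer.
fits_bsplu11i = ("AC" + engineered) == ENZYMES["BspLU11I"][0]
fits_ncoi = ("CC" + engineered) == ENZYMES["NcoI"][0]

print(overhangs, fits_bsplu11i, fits_ncoi)
```

Because the second codon begins with T, an ATG start can be embedded in a BspLU11I site without altering the second codon, whereas an NcoI site would require a G at that position. Both enzymes leave the same CATG overhang, so the fragment can still be ligated into the NcoI-cut backbone.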

Example 8: Fermentation results

It has been shown that the level of pyruvate carboxylase activity varies
greatly with the carbon source used when the gene is expressed from its native C.
glutamicum promoter. Therefore, production of pyruvate carboxylase in strains grown
on these carbon sources was examined.

The strains NRRL B-11474, NRRL B-11474 (pLW305), and NRRL B-11474
Δpyc candidate 35 were cultured in flasks on minimal medium for NRRL B-11474
with two different sources of carbon: glucose or lactate. The results on growth and amino
acid production are presented below.





                         glucose                                lactate
                         biomass     lysine      Y lys/glc      biomass   lysine   Y lys/lac
                         (g/l)       (g/l)       (g/g)          (g/l)     (g/l)    (g/g)
NRRL B-11474             6.7 ± 0.2   5.0 ± 0.7   0.21           3         1.7      0.12
NRRL B-11474 (pLW305)    7.3 ± 0.2   5.3 ± 0.2   0.22           4         2.5      0.15
Δpyc #35                 1.1         0           0              0         0        0

NRRL B-11474 and NRRL B-11474 (pLW305) show the same behavior on glucose. Both
strains produce similar amounts of biomass and lysine. On lactate the strains also have
similar yields of lysine. NRRL B-11474 (pLW305) consumed all of the lactate in the
medium (17 g/l), whereas the wild-type NRRL B-11474 consumed 40% less lactate during
the same period of time. NRRL B-11474 was calculated to consume lactate at a rate
of 0.37 g lactate/hour, whereas the NRRL B-11474 (pLW305) strain consumed this
substrate at a rate of 0.65 g lactate/hour.
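
Two of the figures above can be cross-checked with simple arithmetic. The sketch below assumes that the yield is defined as lysine produced divided by substrate consumed. This definition is an assumption, since the calculation is not spelled out in the text. It uses only values stated above for lactate.

```python
LACTATE_CONSUMED_PLW305 = 17.0   # g/l, all lactate in the medium
LYSINE_PLW305 = 2.5              # g/l, from the results table
RATE_WILD_TYPE = 0.37            # g lactate/hour
RATE_PLW305 = 0.65               # g lactate/hour

# Assumed definition: yield = lysine produced / lactate consumed.
yield_plw305 = LYSINE_PLW305 / LACTATE_CONSUMED_PLW305

# Fractional shortfall of the wild-type consumption rate.
rate_deficit = 1.0 - RATE_WILD_TYPE / RATE_PLW305

print(f"Y lys/lac (pLW305): {yield_plw305:.2f} g/g")
print(f"wild-type rate is {rate_deficit * 100:.0f}% lower")
```

The computed yield matches the tabulated 0.15 g/g, and the rate difference of roughly 43% agrees with the statement that the wild-type strain consumed about 40% less lactate.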

The NRRL B-11474 Δpyc strain does not grow on lactate, which is consistent with
the expected phenotype. Its growth on glucose is very low and the strain does not
produce lysine. Kinetic studies are conducted to characterize further the behavior of
these strains.

Example 9: Visualization of biotinylated proteins 

Pyruvate carboxylase contains biotin. Therefore, it should be possible to
detect the accumulation of this enzyme by monitoring the appearance of specific
biotinylated products in cells.

Example 10: Electrophoretic gels

To detect biotinylated proteins in electrophoretic gels, a commercially
available streptavidin linked to alkaline phosphatase was used. Crude protein lysates
were prepared from induced and uninduced cultures of E. coli DH5α or NRRL B-11474
harboring pAPE12 or pLW305, and the proteins were separated on duplicate 7.5%
polyacrylamide denaturing electrophoretic gels. One gel of each pair is stained with
Coomassie Brilliant Blue to visualize all proteins and ensure equal levels of protein were
loaded in each lane.
The other gels are treated with the streptavidin-alkaline phosphatase reagent, which binds
to biotinylated proteins. The location of these proteins can then be visualized by
providing alkaline phosphatase with a colorimetric substrate, 5-bromo-4-chloro-3-indolyl
phosphate (BCIP). As reported by others, two major biotinylated proteins were detected.
The higher molecular weight species (approx. 120 kDa) has been shown to be pyruvate
carboxylase, and the lower molecular weight species (approx. 60 kDa) is the biotinylated
subunit of acetyl-CoA carboxylase.

All publications mentioned hereinabove are hereby incorporated in their
entirety by reference.

While the foregoing invention has been described in some detail for
purposes of clarity and understanding, it will be appreciated by one skilled in the art from
a reading of this disclosure that various changes in form and detail can be made without
departing from the true scope of the invention and appended claims.

WHAT IS CLAIMED IS:



1. An isolated nucleic acid molecule comprising a polynucleotide having
a nucleotide sequence at least 95% identical to a sequence selected from the
group consisting of:

(a) a nucleotide sequence encoding the pyruvate carboxylase
polypeptide having the amino acid sequence in SEQ ID NO:2;

(b) a nucleotide sequence encoding the pyruvate carboxylase
polypeptide having the complete amino acid sequence encoded by the cosmid
clone contained in ATCC Deposit No. ; and

(c) a nucleotide sequence complementary to any of the nucleotide
sequences in (a) or (b).

2. The nucleic acid molecule of claim 1 wherein said polynucleotide has
the complete nucleotide sequence in SEQ ID NO:1.

3. The nucleic acid molecule of claim 1 wherein said polynucleotide has
the nucleotide sequence in SEQ ID NO:1 encoding the pyruvate carboxylase
polypeptide having the amino acid sequence in SEQ ID NO:2.

4. The nucleic acid molecule of claim 1 wherein said polynucleotide has
the nucleotide sequence encoding the pyruvate carboxylase polypeptide having
the complete amino acid sequence encoded by the cosmid clone contained in
ATCC Deposit No. .



5. An isolated nucleic acid molecule comprising a polynucleotide which
hybridizes under stringent hybridization conditions to a polynucleotide having
a nucleotide sequence identical to a nucleotide sequence in (a), (b) or (c) of claim
1 wherein said polynucleotide which hybridizes does not hybridize under
stringent hybridization conditions to a polynucleotide having a nucleotide
sequence consisting of only A residues or of only T residues.

6. The isolated nucleic acid molecule of claim 1, wherein said
polynucleotide is DNA.

7. The isolated nucleic acid molecule of claim 1, wherein said
polynucleotide is RNA.

8. A method for making a recombinant vector comprising inserting an 
isolated nucleic acid molecule of claim 1 into a vector. 



9. A recombinant vector produced by the method of claim 8.

10. A method of making a recombinant host cell comprising introducing the
recombinant vector of claim 9 into a host cell.

11. A recombinant host cell produced by the method of claim 10.

12. A recombinant method for producing a pyruvate carboxylase
polypeptide, comprising culturing the recombinant host cell of claim 11 under
conditions such that said polypeptide is expressed and recovering said
polypeptide.



13. The method of claim 12, wherein said pyruvate carboxylase is expressed 
2 to 20 fold higher than its expression in Corynebacterium glutamicum. 

14. An isolated pyruvate carboxylase polypeptide having an amino acid
sequence at least 95% identical to a sequence selected from the group consisting
of:

(a) the amino acid sequence of the pyruvate carboxylase polypeptide
having the complete amino acid sequence in SEQ ID NO:2;

(b) the amino acid sequence of the pyruvate carboxylase polypeptide
having the complete amino acid sequence encoded by the cosmid clone
contained in ATCC Deposit No. ; and

15. A method of making amino acids comprising expressing the nucleotide
sequence of claim 1 and recovering said amino acids.

16. The method of claim 15, wherein said amino acid is lysine.

17. The method of claim 15, wherein pyruvate carboxylase is expressed 2 to 20
fold higher than in Corynebacterium glutamicum.

TGGGGCGGGGTTAGATCCTGGGGGGTTTATTTCATTCAC
TTTGGCTTGAAGTCGTGCAGGTCAGGGGAGTGTTGCCCGAAAACA 
TTGAGAGGAAAACAAAAACCGATGTTTGATTGGGGGAATCGGGGG 
TTACGATACTAGGACGCAGTGACTGCTATCACCCTTGGCGGTCTC 
175 TTGTTGAAAGGAATAATTACTCTAGTGTCGACTCACACATCTTCA

M S T H T S S 
220 ACGCTTCCAGCATTCAAAAAGATCTTGGTAGCAAACCGCGGCGAA 

TLPAFKKILVANRGE 
265 ATCGCGGTCCGTGCTTTCCGTGCAGCACTCGAAACCGGTGCAGCC 

IAVRAFRAALETGAA 
310 ACGGTAGCTATTTACCCCCGTGAAGATCGGGGATCATTCCACCGC 

TVAIYPREDRGSFHR 
355 TCTTTTGCTTCTGAAGCTGTCCGCATTGGTACCGAAGGCTCACCA 

SFASEAVRIGTEGSP 
400 GTCAAGGCGTACCTGGACATCGATGAAATTATCGGTGCAGCTAAA 

VKAYLDIDEIIGAAK 
445 AAAGTTAAAGCAGATGCCATTTACCCGGGATACGGCTTCCTGTCT 

KVKADAIYPGYGFLS 
490 GAAAATGCCCAGCTTGCCCGCGAGTGTGCGGAAAACGGCATTACT 

ENAQLARECAENGIT
535 TTTATTGGCCCAACCCCAGAGGTTCTTGATCTCACCGGTGATAAG 

FIGPTPEVLDLTGDK 
580 TCTCGCGCGGTAACCGCCGCGAAGAAGGCTGGTCTGCCAGTTTTG 

SRAVTAAKKAGLPVL 
625 GCGGAATCCACCCCGAGCAAAAACATCGATGAGATCGTTAAAAGC 

AESTPSKNIDEIVKS 
670 GCTGAAGGCCAGACTTACCCCATCTTTGTGAAGGCAGTTGCCGGT 

AEGQTYPIFVKAVAG
715 GGTGGCGGACGCGGTATGCGTTTTGTTGCTTCACCTGATGAGCTT 

GGGRGMRFVASPDEL 
760 CGCAAATTAGCAACAGAAGCATCTCGTGAAGCTGAAGCGGCTTTC 

RKLATEASREAEAAF 
805 GGCGATGGCGCGGTATATGTCGAACGTGCTGTGATTAACCCTCAG 

GDGAVYVERAVINPQ
850 CATATTGAAGTGCAGATCCTTGGCGATCACACTGGAGAAGTTGTA 

HIEVQILGDHTGEVV 
895 CACCTTTATGAACGTGACTGCTCACTGCAGCGTCGTCACCAAAAA 

HLYERDCSLQRRHQK
940 GTTGTCGAAATTGCGCCAGCACAGCATTTGGATCCAGAACTGCGT 
VVEIAPAQHLDPELR 

FIG. 1A

985 GATCGCATTTGTGCGGATGCAGTAAAGTTCTGCCGCTCCATTGGT

DRICADAVKFCRSIG
1030 TACCAGGGCGCGGGAACCGTGGAATTCTTGGTCGATGAAAAGGGC

YQGAGTVEFLVDEKG
1075 AACCACGTCTTCATCGAAATGAACCCACGTATCCAGGTTGAGCAC

NHVFIEMNPRIQVEH 
1120 ACCGTGACTGAAGAAGTCACCGAGGTGGACCTGGTGAAGGCGCAG 

TVTEEVTEVDLVKAQ 
1165 ATGCGCTTGGCTGCTGGTGCAACCTTGAAGGAATTGGGTCTGACC

MRLAAGATLKELGLT 
1210 CAAGATAAGATCAAGACCCACGGTGCAGCACTGCAGTGCCGCATC 

QDKIKTHGAALQCRI 
1255 ACCACGGAAGATCCAAACAACGGCTTCCGCCCAGATACCGGAACT 

TTEDPNNGFRPDTGT 
1300 ATCACCGCGTACCGCTCACCAGGCGGAGCTGGCGTTCGTCTTGAC

ITAYRSPGGAGVRLD
1345 GGTGCAGCTCAGCTCGGTGGCGAAATCACCGCACACTTTGACTCC

GAAQLGGEITAHFDS
1390 ATGCTGGTGAAAATGACCTGCCGTGGTTCCGACTTTGAAACTGCT

MLVKMTCRGSDFETA 
1435 GTTGCTCGTGCACAGCGCGCGTTGGCTGAGTTCACCGTGTCTGGT 

VARAQRALAEFTVSG 
1480 GTTGCAACCAACATTGGTTTCTTGCGTGCGTTGCTGCGGGAAGAG

VATNIGFLRALLREE
1525 GACTTCACTTCCAAGCGCATCGCCACCGGATTCATTGCCGATCAC

DFTSKRIATGFIADH
1570 CCGCACCTCCTTCAGGCTCCACCTGCTGATGATGAGCAGGGACGC

PHLLQAPPADDEQGR 
1615 ATCCTGGATTACTTGGCAGATGTCACCGTGAACAAGCCTCATGGT 

ILDYLADVTVNKPHG 
1660 GTGCGTCCAAAGGATGTTGCAGCTCCTATCGATAAGCTGCCTAAC 

VRPKDVAAPIDKLPN 
1705 ATCAAGGATCTGCCACTGCCACGCGGTTCCCGTGACCGCCTGAAG

IKDLPLPRGSRDRLK
1750 CAGCTTGGCCCAGCCGCGTTTGCTCGTGATCTCCGTGAGCAGGAC

QLGPAAFARDLREQD
1795 GCACTGGCAGTTACTGATACCACCTTCCGCGATGCACACCAGTCT
ALAVTDTTFRDAHQS

FIG. 1B

1840 TTGCTTGCGACCCGAGTCCGCTCATTCGCACTGAAGCCTGCGGCA

LLATRVRSFALKPAA
1885 GAGGCCGTCGCAAAGCTGACTCCTGAGCTTTTGTCCGTGGAGGCC

EAVAKLTPELLSVEA
1930 TGGGGCGGCGCGACCTACGATGTGGCGATGCGTTTCCTCTTTGAG

WGGATYDVAMRFLFE
1975 GATCCGTGGGACAGGCTCGACGAGCTGCGGGAGGCGATGCCGAAT

DPWDRLDELREAMPN 
2020 GTAAACATTCAGATGCTGCTTCGCGGCCGCAACACCGTGGGATAC 

VNIQMLLRGRNTVGY 
2065 ACCCCGTACCCAGACTCCGTCTGCCGCGCGTTTGTTAAGGAAGCT 

TPYPDSVCRAFVKEA 
2110 GCCAGCTCCGGCGTGGACATCTTCCGCATCTTCGACGCGCTTAAC 

ASSGVDIFRIFDALN 
2155 GACGTCTCCCAGATGCGTCCAGCAATCGACGCAGTCCTGGAGACC 

DVSQMRPAIDAVLET 
2200 AACACCGCGGTAGCCGAGGTGGCTATGGCTTATTCTGGTGATCTC 

NTAVAEVAMAYSGDL 
2245 TCTGATCCAAATGAAAAGCTCTACACCCTGGATTACTACCTAAAG 

SDPNEKLYTLDYYLK 
2290 ATGGCAGAGGAGATCGTCAAGTCTGGCGCTCACATCTTGGCCATT 

MAEEIVKSGAHILAI 
2335 AAGGATATGGCTGGTCTGCTTCGCCCAGCTGCGGTAACCAAGCTG 

KDMAGLLRPAAVTKL 
2380 GTCACCGCACTGCGCCGTGAATTCGATCTGCCAGTGCACGTGCAC 

VTALRREFDLPVHVH 
2425 ACCCACGACACTGCGGGTGGCCAGCTGGCAACCTACTTTGCTGCA 

THDTAGGQLATYFAA 
2470 GCTCAAGCTGGTGCAGATGCTGTTGACGGTGCTTCCGCACCACTG 

AQAGADAVDGASAPL 
2515 TCTGGCACCACCTCCCAGCCATCCCTGTCTGCCATTGTTGCTGCA

SGTTSQPSLSAIVAA 
2560 TTCGCGCACACCCGTCGCGATACCGGTTTGAGCCTCGAGGCTGTT 

FAHTRRDTGLSLEAV 
2605 TCTGACCTCGAGCCGTACTGGGAAGCAGTGCGCGGACTGTACCTG 

SDLEPYWEAVRGLYL 
2650 CCATTTGAGTCTGGAACCCCAGGCCCAACCGGTCGCGTCTACCGC 

PFESGTPGPTGRVYR 
2695 CACGAAATCCCAGGCGGACAGTTGTCCAACCTGCGTGCACAGGCC 
HEIPGGQLSNLRAQA

FIG. 1C

2740 ACCGCACTGGGCCTTGCGGATCGTTTCGAACTCATCGAAGACAAC

TALGLADRFELIEDN 
2785 TACGCAGCCGTTAATGAGATGCTGGGACGCCCAACCAAGGTCACC 

YAAVNEMLGRPTKVT 
2830 CCATCCTCCAAGGTTGTTGGCGACCTCGCACTCCACCTCGTTGGT 

PSSKVVGDLALHLVG 
2875 GCGGGTGTGGATCCAGCAGACTTTGCTGCCGATCCACAAAAGTAC 

AGVDPADFAADPQKY 
2920 GACATCCCAGACTCTGTCATCGCGTTCCTGCGCGGCGAGCTTGGT 

DIPDSVIAFLRGELG 
2965 AACCCTCCAGGTGGCTGGCCAGAGCCACTGCGCACCCGCGCACTG 

NPPGGWPEPLRTRAL 
3010 GAAGGCCGCTCCGAAGGCAAGGCACCTCTGACGGAAGTTCCTGAG 

EGRSEGKAPLTEVPE 
3055 GAAGAGCAGGCGCACCTCGACGCTGATGATTCCAAGGAACGTCGC 

EEQAHLDADDSKERR 
3100 AATAGCCTCAACCGCCTGCTGTTCCCGAAGCCAACCGAAGAGTTC 

NSLNRLLFPKPTEEF 
3145 CTCGAGCACCGTCGCCGCTTCGGCAACACCTCTGCGCTGGATGAT 

LEHRRRFGNTSALDD 
3190 CGTGAATTCTTCTACGGCCTGGTCGAAGGCCGCGAGACTTTGATC 

REFFYGLVEGRETLI
3235 CGCCTGCCAGATGTGCGCACCCCACTGCTTGTTCGCCTGGATGCG 

RLPDVRTPLLVRLDA 
3280 ATCTCTGAGCCAGACGATAAGGGTATGCGCAATGTTGTGGCCAAC 

ISEPDDKGMRNVVAN 
3325 GTCAACGGCCAGATCCGCCCAATGCGTGTGCGTGACCGCTCCGTT 

VNGQIRPMRVRDRSV 
3370 GAGTCTGTCACCGCAACCGCAGAAAAGGCAGATTCCTCCAACAAG 

ESVTATAEKADSSNK 
3415 GGCCATGTTGCTGCACCATTCGCTGGTGTTGTCACCGTGACTGTT 

GHVAAPFAGVVTVTV 
3460 GCTGAAGGTGATGAGGTCAAGGCTGGAGATGCAGTCGCAATCATC 

AEGDEVKAGDAVAII
3505 GAGGCTATGAAGATGGAAGCAACAATCACTGCTTCTGTTGACGGC 

EAMKMEATITASVDG
3550 AAAATCGATCGCGTTGTGGTTCCTGCTGCAACGAAGGTGGAAGGT 

KIDRVVVPAATKVEG 
3595 GGCGACTTGATCGTCGTCGTTTCCTAA 3621 
GDLIVVVS*